Ontology-Based File Naming Through Hierarchical Conceptual Clustering
Authors
Abstract
Current directory-based hierarchical file systems have many limitations as the amount of unstructured data held by individual users increases continuously. One of the most significant problems is that users have difficulty searching, navigating, and organizing their files, since useful semantic information describing a file is not used in the current directory-based system. To solve this problem, several research groups have suggested attribute-based file naming systems. However, these approaches have not been widely adopted because they lack semantic information. In this paper, we describe an ontology-based semantic file naming approach that employs hierarchical conceptual clustering to capture more complex semantic information from the set of file attributes. Ontologies, which play a major role on the Semantic Web, describe the semantics of data by organizing data into taxonomies of concepts and describing the relationships between concepts. To generate the ontology from the set of attribute-value pairs for files, we first extend one of the standard incremental hierarchical clustering techniques, COBWEB, and propose a new clustering evaluation measure to guide the search through the space of clusterings. From the clustering result, we then generate the ontology and represent it in RDF Schema. Our experimental results show that our extended clustering approach produces a good-quality concept hierarchy, is computationally efficient, and is well suited to building an ontology-based semantic file system.
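The abstract does not spell out the new evaluation measure, so the following is only a minimal sketch of the standard category utility that the original COBWEB uses to score a partition of attribute-value descriptions. It is not the extended measure proposed in the paper. The function names and the sample file attributes (`type`, `project`) are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def expected_correct_guesses(files):
    """Sum of P(attribute = value)^2 over all attribute-value pairs in `files`.

    Each file is a dict of attribute -> value (hypothetical representation).
    """
    n = len(files)
    counts = Counter((attr, value) for f in files for attr, value in f.items())
    return sum((c / n) ** 2 for c in counts.values())


def category_utility(clusters):
    """Standard COBWEB category utility of a partition (list of lists of files).

    This is the textbook measure, not the paper's extended one.
    """
    all_files = [f for cluster in clusters for f in cluster]
    n = len(all_files)
    baseline = expected_correct_guesses(all_files)
    gain = sum(
        len(cluster) / n * (expected_correct_guesses(cluster) - baseline)
        for cluster in clusters
    )
    return gain / len(clusters)


# Hypothetical attribute-value descriptions of four files.
a = {"type": "pdf", "project": "thesis"}
b = {"type": "pdf", "project": "thesis"}
c = {"type": "jpg", "project": "vacation"}
d = {"type": "jpg", "project": "vacation"}

coherent = [[a, b], [c, d]]  # files grouped by shared attribute values
mixed = [[a, c], [b, d]]     # files grouped arbitrarily

print(category_utility(coherent))
print(category_utility(mixed))
```

A partition that groups files with shared attribute values scores higher than an arbitrary one, which is the property an incremental clusterer relies on when deciding where to place a new file in the hierarchy.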
Similar resources
A File Naming Scheme Using Hierarchical-Keywords
In this paper, we propose a file naming scheme, called HK (Hierarchical-Keyword-based) naming. In file systems, hierarchical naming has been used for several decades. As the number of files stored in file systems increases, the weakness of hierarchical naming is becoming recognized. Some researchers have proposed hybrid naming schemes which introduce attribute-based naming into hierarchica...
Comparing Conceptual, Divisive and Agglomerative Clustering for Learning Taxonomies from Text
The application of clustering methods for automatic taxonomy construction from text requires knowledge about the tradeoff between, (i), their effectiveness (quality of result), (ii), efficiency (run-time behaviour), and, (iii), traceability of the taxonomy construction by the ontology engineer. In this line, we present an original conceptual clustering method based on Formal Concept Analysis fo...
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, one of the important techniques of text mining, is the unsupervised classification of similar documents into different groups. The most important step...
Contextual Hierarchy Driven Ontology Learning
Research in ontology learning has always separated the ontology building and evaluation tasks. Moreover, it has used, for example, a sentence, a syntactic structure, or a set of words to establish the context of a word. However, this research does not account for the structure of the document or the relations between contexts. In our work, we combine these elements to generate an appropri...
Publication date: 2004